Chapter 3: Types

Julia's Type System

A type system describes a programming language's way of handling individual pieces of data and determining how to operate on them based on their type.

Julia's type system is primarily dynamic, meaning that there is no need to tell Julia what type a particular value is. This is useful, in that you can write fairly complex applications without ever needing to specify types. You might, then, be tempted to disregard types as an advanced feature that you cannot be bothered with right now. However, a good understanding of types is extremely helpful to mastering a language like Julia.

Julia's dynamic system is augmented by the ability to specify types where needed. This has two advantages. First, type specification can lead to more efficient code, and it makes your code more robust by catching values of the wrong type early. Second, unlike in a statically typed language, you do not need to get your head around types at the very beginning. Thus, you can treat this chapter not so much as a tutorial exercise but as a reference you can come back to every now and then.

It's important to understand that in Julia, types belong to values, not variables. It's also important to understand the hierarchy of types in Julia; they may be abstract or concrete, where abstract is the highest level of a type, and concrete is the lowest level:

Abstract Types: You can think of an abstract type as a "family" of types intended solely to act as a supertype of other types; it cannot be instantiated itself. No value is therefore ever directly of an abstract type, but a group of similar types can be gathered under one (e.g. Int8, Int16, and Int64 are all subtypes of the abstract Signed type). Any is the default supertype of any type you create.

We can see the supertype of a type using the super() function:


In [94]:
println("""
 8 is a $(super(typeof(8))) type, 
  ↘ a subset of:
    $(super(super(typeof(8)))) types  
     ↘ which is a subset of: 
       $(super((super(super(typeof(8)))))) types
        ↘ which is a subset of: 
          $(super(super((super(super(typeof(8))))))) types
           ↘ which is a subset of: 
             $(super(super(super((super(super(typeof(8)))))))), the final abstract type.
""")


 8 is a Signed type, 
  ↘ a subset of:
    Integer types  
     ↘ which is a subset of: 
       Real types
        ↘ which is a subset of: 
          Number types
           ↘ which is a subset of: 
             Any, the final abstract type.

Concrete Types: These types are intended to be the types of actual objects (Int8, Float16, etc.) and they are always subtypes of abstract types. Concrete types therefore cannot have any subtypes of their own, and every type you create has a supertype. Here it is useful to mention an interesting property of Julia's type system: any two types always have a common ancestor type (at the very least, Any).


In [105]:
println("""
8 is a $(typeof(8)), one of the Integer Subtypes: $(subtypes(Integer)).
""")


8 is a Int64, one of the Integer Subtypes: Any[BigInt,Bool,Signed,Unsigned].
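
The common-ancestor property can be checked directly. The following is a small illustrative sketch, not part of the original notebook: it relies on the subtype operator <: and on Base.typejoin, which returns the closest common ancestor of two types. The variable names are made up for this example.

```julia
# Chained subtype checks read left to right, like a < b < c.
chain_holds = Int64 <: Signed <: Integer <: Real <: Number <: Any

# A value belongs to every abstract type above its concrete type.
eight_is_integer = isa(8, Integer)        # an integer literal sits below Integer
eight_is_float = isa(8, AbstractFloat)    # but not in the floating-point family

# Base.typejoin returns the closest common ancestor of two types.
ancestor_of_ints = Base.typejoin(Int8, Int64)        # two signed integers
ancestor_of_mixed = Base.typejoin(Int64, Float64)    # an integer and a float
ancestor_of_unrelated = Base.typejoin(Int64, Char)   # nothing in common but Any
```

The closer two types are in the tree, the more specific their common ancestor: Signed for the two integer types, Real for an integer and a float, and Any when nothing else is shared.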

If this feels incredibly convoluted, just bear with it for now. It will make more sense once you get around to its practical implementations.

Declaring and Testing Types

Julia's primary type operator is :: (double-colons). It has three different uses, all fairly important, and it's crucial that you understand the different functions that :: fulfills in the different contexts.

Use 1: Declaring a (sub)type

In the context of a statement, such as a function, :: appended to a variable means 'this variable is always to be of this type'.

In the following, we will create a function that returns 32 as Int8 (for now, let's ignore that we don't know much about functions and we don't quite know what integer types exist – these will all be explained shortly!).


In [2]:
function restrict_this_integer()
    x::Int8 = 32
    x
end


Out[2]:
restrict_this_integer (generic function with 1 method)

In [3]:
p = restrict_this_integer()


Out[3]:
32

In [4]:
typeof(p)


Out[4]:
Int8

As we can see, the :: within the function had the effect that the returned result would be represented as an 8-bit integer (Int8). Recall that this only works in the context of a statement such as a function body. Elsewhere, entering x::Int8 is a type assertion (see Use 2): if x holds an integer literal, which Julia understands by default to be an Int64, it yields a typeassert error, because an Int64 value cannot simply be declared to be an Int8.
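
To make the 'always of this type' part concrete, here is a minimal sketch (the function and variable names are hypothetical). It assumes only that a declared variable converts every value assigned to it, and that a value that does not fit raises an InexactError.

```julia
function keep_as_int8()
    x::Int8 = 32   # declare x as Int8 for the whole function body
    x = 100        # a later assignment is converted to Int8 too
    x
end

function too_big_for_int8()
    x::Int8 = 300  # 300 does not fit into 8 bits
    x
end

kept = keep_as_int8()

# The failed conversion surfaces as an InexactError at run time.
overflow_caught = try
    too_big_for_int8()
    false
catch err
    isa(err, InexactError)
end
```

The declaration therefore acts as a promise about the variable, not just about its first value.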

Use 2: Asserting a type

In every other context, :: means 'I assert this value is of this particular type'. This is a great way to check a value for both abstract and concrete type.

For instance, you are provided a variable input_from_user. How do you make sure it has the right kind of value?


In [5]:
input_from_user = 128


Out[5]:
128

In [106]:
input_from_user::Int # Is the input_from_user variable in the Int family?


Out[106]:
128

In [107]:
input_from_user::Char # Is the input_from_user variable in the Char family?


LoadError: TypeError: typeassert: expected Char, got Int64
while loading In[107], in expression starting on line 1

As you can see, if you assert a type that the value belongs to, you get the value returned. In our second assertion, where we asserted that the value was of the type Char (used to store individual characters), we got a typeassert error, which we can catch later on and handle to ensure that we get the right type of value.

Remember, every Float64 (a concrete type) is also an AbstractFloat (an abstract type), so asking if the Type is an AbstractFloat will always be valid for any BigFloat, Float16, Float32, or Float64 value:


In [45]:
val = rand()
val::AbstractFloat


Out[45]:
0.07512372464596617

However, asserting a different concrete type yields a typeassert error. Just as input_from_user::Int64 returns 128 while input_from_user::Int32 would fail, val is a Float64, so asserting that it is a Float32 fails:


In [46]:
typeof(val)


Out[46]:
Float64

In [47]:
val::Float32


LoadError: TypeError: typeassert: expected Float32, got Float64
while loading In[47], in expression starting on line 1
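
Catching a failed assertion might look like the sketch below. The helper assert_or_fallback is hypothetical, written only for this illustration; it assumes that a failed :: assertion raises a TypeError, as the messages above suggest.

```julia
# Hypothetical helper: return `value` if it is a T, otherwise `fallback`.
function assert_or_fallback(value, T, fallback)
    try
        value::T                          # raises a TypeError if the assertion fails
    catch err
        isa(err, TypeError) || rethrow()  # pass any other kind of error along
        fallback
    end
end

input_from_user = 128
as_int = assert_or_fallback(input_from_user, Int, 0)
as_char = assert_or_fallback(input_from_user, Char, '?')

# isa() answers the same question without raising an error at all.
is_integer = isa(input_from_user, Integer)
```

When all you need is a yes or no answer, isa() is usually the simpler tool; the assertion is more useful when a wrong type should stop the program.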

Use 3: Specifying acceptable function inputs

While we have not really discussed function inputs, you should be familiar with the general idea of a function – values go in, results go out. In Julia, you have the possibility to make sure your function only accepts values that you want it to. Consider creating a function that adds up only floating point numbers:


In [108]:
function addition(x::Float64, y::Float64)
    x + y
end


Out[108]:
addition (generic function with 1 method)

Calling it on two floating-point numbers will, of course, yield the expected result:

addition(3.14, 2.71)

But giving it a simpler task will raise an error:


In [109]:
addition(1, 1)


LoadError: MethodError: `addition` has no method matching addition(::Int64, ::Int64)
while loading In[109], in expression starting on line 1

What the error complaining about the lack of a method matching addition(::Int64, ::Int64) means is that Julia cannot find a definition for the name addition that would accept two Int64 values.

The real meaning of this error is a little complex, and refers to one of the base features of Julia called multiple dispatch.

The simple version is that there are different ways to compute something based on the types involved (e.g. Float64 addition is a different operation from Int64 addition). Knowing these types ahead of time helps Julia's JIT compiler (which is built on LLVM), because it has to generate code for fewer kinds of addition, instead of every kind of addition.

In Julia, you can define a function several times under the same name, with each definition (a 'method') processing different types of inputs, so e.g. an add() function can add up Int and Float inputs but concatenate String type inputs. Multiple dispatch effectively creates a table of every combination of types for which the function is defined and looks up the right method at call time (so you can use both abstract and concrete types without a performance penalty).
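
As a minimal sketch of that idea, the add() function mentioned above could be written with one method per family of inputs. The function is hypothetical and exists only for this illustration.

```julia
# Hypothetical add() with one method per family of input types.
add(x::Real, y::Real) = x + y                              # numbers are summed
add(x::AbstractString, y::AbstractString) = string(x, y)   # strings are joined

sum_of_ints = add(1, 2)
sum_of_mixed = add(1, 2.5)
joined = add("hello", "world")
method_count = length(methods(add))

# A combination with no matching method raises a MethodError.
no_method = try
    add(1, "world")
    false
catch err
    isa(err, MethodError)
end
```

Both definitions share one name, and Julia picks between them purely from the types of the arguments at the call site.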

Making the type more concrete, however, effectively reduces the size of the table. Here's the table loaded for the Any type:


In [187]:
subtypes(Any) # Every Subtype


Out[187]:
238-element Array{Any,1}:
 AbstractArray{T,N}                        
 AbstractChannel                           
 AbstractRNG                               
 AbstractString                            
 Any                                       
 Associative{K,V}                          
 Base.AbstractCmd                          
 Base.AbstractMsg                          
 Base.AbstractZipIterator                  
 Base.Cartesian.LReplace{S<:AbstractString}
 Base.Combinations{T}                      
 Base.Count{S<:Number}                     
 Base.Cycle{I}                             
 ⋮                                         
 TypeVar                                   
 Type{T}                                   
 UniformScaling{T<:Number}                 
 Val{T}                                    
 Vararg{T}                                 
 VersionNumber                             
 Void                                      
 WeakRef                                   
 WorkerConfig                              
 ZMQ.Context                               
 ZMQ.MsgPadding                            
 ZMQ.Socket                                

All of which have their own subtypes. Therefore, narrowing down this table of possible types can significantly decrease the computational time and memory required at compilation.

Consider the following multiplication functions with different types.


In [17]:
function any_multiplication(x, y) # Multiplication with Any types
    x * y
end

function int_multiplication(x::Int, y::Int) # Multiplication with Int types
    x * y
end

function int64_multiplication(x::Int64, y::Int64) # Multiplication with Int64 types
    x * y
end


Out[17]:
int64_multiplication (generic function with 1 method)

This is their runtime in seconds and memory in bytes, as reported by Julia's @time macro:


In [18]:
@time any_multiplication(1,1)   #Any on Int64
@time int_multiplication(1,1)   #Int on Int64
@time int64_multiplication(1,1) #Int64 on Int64


  0.001754 seconds (302 allocations: 16.047 KB)
  0.001272 seconds (302 allocations: 16.047 KB)
  0.001095 seconds (302 allocations: 16.047 KB)
Out[18]:
1

Note that another very powerful 'under-the-hood' feature of Julia is its ability to keep a function compiled for particular types in memory after the first call, in case it is used again, which significantly improves its performance.

So calling the function again on types you have already called it on does not require recompiling it:


In [19]:
@time any_multiplication(1,1)   #Any on Int64 again
@time int_multiplication(1,1)   #Int on Int64 again
@time int64_multiplication(1,1) #Int64 on Int64 again


  0.000003 seconds (4 allocations: 160 bytes)
  0.000003 seconds (4 allocations: 160 bytes)
  0.000002 seconds (4 allocations: 160 bytes)
Out[19]:
1

If we run any_multiplication() on String values, a new version of the function must be compiled for those types, costing computational time again.

Side note: multiplication on strings in Julia is concatenation (equivalent to a + b in Python).


In [20]:
@time any_multiplication("hello","world")   #Any on String


  0.001547 seconds (447 allocations: 21.350 KB)
Out[20]:
"helloworld"

Getting the type of a value

To obtain the type of a value, use the typeof() function:


In [21]:
typeof(32)


Out[21]:
Int64

typeof() is notable for treating tuples differently from most other collections.

Calling typeof() on a tuple enumerates the types of each element, whereas calling it on, say, an Array value returns the Array notation of type (which uses the narrowest type common to all the values, up to Any):


In [34]:
typeof([1, 2, "a"]) # Array - narrowest common type of the elements


Out[34]:
Array{Any,1}

In [35]:
typeof((1, 2, "a")) # Tuple - the type of every element


Out[35]:
Tuple{Int64,Int64,ASCIIString}
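
The difference is easiest to see side by side. This sketch uses eltype() to ask an array for its element type; it assumes a recent Julia release, where the string type is called String rather than ASCIIString. The variable names are illustrative.

```julia
tuple_type = typeof((1, 2.0, "a"))   # one type parameter per element
mixed_array = [1, 2, "a"]            # no common type short of Any
numeric_array = [1, 2.0]             # Int and Float64 promote to Float64

element_type_mixed = eltype(mixed_array)
element_type_numeric = eltype(numeric_array)
```

A tuple remembers the type of each position, whereas an array settles on a single element type for all of its contents.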

Helpfully, the isa() function tells us whether something is a particular type:


In [36]:
isa("River", ASCIIString)


Out[36]:
true

And, of course, types have types (specifically, DataType)!


In [37]:
typeof("River")


Out[37]:
ASCIIString

In [39]:
typeof(ans)


Out[39]:
DataType

Exploring Type hierarchy

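The <: operator can help you find out whether the left-side type is a subtype of the right-side type. Thus, we see that Int64 is a subtype of Int (on a 64-bit system, Int is simply another name for Int64, and every type counts as a subtype of itself), but ASCIIString isn't!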


In [61]:
Int64 <: Int


Out[61]:
true

In [62]:
ASCIIString <: Int


Out[62]:
false
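
A few more checks show how the operator behaves. This is an illustrative sketch only; it assumes Sys.WORD_SIZE, which is how recent Julia releases expose the platform's word size.

```julia
int_is_subtype_of_itself = Int64 <: Int64    # every type is its own subtype
concrete_below_abstract = Int64 <: Integer   # a concrete type under its family
abstract_below_concrete = Integer <: Int64   # the relation does not run upwards

# Int is an alias chosen to match the platform's word size.
int_matches_word_size = (Int === Int64) == (Sys.WORD_SIZE == 64)
```

On a 32-bit system Int would stand for Int32 instead, which is why code that must be a particular width should say Int64 or Int32 explicitly.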

To reveal the supertype of a type, use the super() function:


In [63]:
super(ASCIIString)


Out[63]:
DirectIndexString

Composite types

Composite types, known to C coders as structs, are more complex object structures that you can define to hold a set of values. For instance, to have a Type that would accommodate geographic coordinates, you would use a composite type. Composite types are created with the type keyword:


In [64]:
type GeoCoordinates
    lat::Float64
    lon::Float64
end

We can then create a new value with this type:


In [65]:
home = GeoCoordinates(51.7519, 1.2578)


Out[65]:
GeoCoordinates(51.7519,1.2578)

In [66]:
typeof(home)


Out[66]:
GeoCoordinates

The values of a composite object are, of course, accessible using the dot notation you might be used to from many other programming languages:


In [67]:
home.lat


Out[67]:
51.7519

In the same way, you can assign new values to it. However, these values have to comply with the type's definition in that they have to be convertible to the type specified (in our case, Float64).

So, for instance, an Int64 input would be acceptable, since you can convert an Int64 into a Float64 easily. On the other hand, an ASCIIString would not do, since you cannot convert it into a Float64.
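
The conversion rule can be tried out as follows. Note the assumptions in this sketch: it is written for a recent Julia release, where the type keyword is spelled mutable struct, and GeoPoint is a hypothetical stand-in for GeoCoordinates so that it does not clash with the definition above.

```julia
# `mutable struct` is the modern spelling of the chapter's `type` keyword.
mutable struct GeoPoint        # hypothetical name for this sketch
    lat::Float64
    lon::Float64
end

spot = GeoPoint(51.7519, 1.2578)
spot.lat = 52                  # an Int is converted to Float64 on assignment
converted_type = typeof(spot.lat)

# A string has no conversion to Float64, so the assignment fails.
string_rejected = try
    spot.lon = "west"
    false
catch err
    isa(err, MethodError)
end
```

The field keeps its declared type no matter what is assigned to it: either the value is converted, or the assignment is refused.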

Creating your very own immutable

An immutable type is one which, once instantiated, cannot be changed. They are created the same way as composite types, except by using the immutable keyword in lieu of type:


In [68]:
immutable ImmutableGeoCoordinates # 'type' becomes 'immutable'
    lat::Float64
    lon::Float64
end



In [69]:
home = ImmutableGeoCoordinates(51.7519, 1.2578)


Out[69]:
ImmutableGeoCoordinates(51.7519,1.2578)

Once instantiated, you cannot change the values. So if we instantiate the immutable ImmutableGeoCoordinates type with the values above, then attempt to change one of its values, we get an error:


In [70]:
home.lat = 51.75


LoadError: type ImmutableGeoCoordinates is immutable
while loading In[70], in expression starting on line 1
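
Since an immutable value cannot be edited, the usual approach is to build a new value instead. The sketch below assumes a recent Julia release, where immutable is spelled struct; FixedPoint is a hypothetical name used only here.

```julia
# `struct` is the modern spelling of the chapter's `immutable` keyword.
struct FixedPoint              # hypothetical name for this sketch
    lat::Float64
    lon::Float64
end

home = FixedPoint(51.7519, 1.2578)

# Assigning to a field of an immutable value raises an error...
mutation_failed = try
    home.lat = 51.75
    false
catch
    true
end

# ...so we build a new value from the old one instead.
moved = FixedPoint(51.75, home.lon)
```

The original value is left untouched, which is exactly the guarantee an immutable type is meant to give.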

Type unions

Sometimes, it's useful to have a single alias for multiple types. To do so, you can create a type union using the constructor Union{}:


In [1]:
Numeric = Union{Int, Float64}


Out[1]:
Union{Float64,Int64}

In [2]:
1::Numeric


Out[2]:
1

In [3]:
1.12::Numeric


Out[3]:
1.12
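
A type union is also handy as a restriction on function inputs. In this minimal sketch the function double() is hypothetical; the Numeric alias is the one defined above, repeated so that the example stands on its own.

```julia
Numeric = Union{Int, Float64}

# Hypothetical function that only accepts members of the union.
double(x::Numeric) = 2x

doubled_int = double(21)
doubled_float = double(1.5)

string_is_numeric = isa("21", Numeric)     # a String is not part of the union
float32_is_numeric = isa(1.5f0, Numeric)   # neither is a Float32
int_in_union = Int <: Numeric              # each member is a subtype of the union
```

Only the types listed in the union are accepted; related types such as Float32 are not included unless you add them.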

From start to finish: creating a custom type

When you hear LSD, you might be tempted to think of the groovy drug that turned the '70s weird. It also refers to one of the biggest problems of early computing in Britain – making computers make sense of Britain's odd pre-decimal currency system before it was abandoned in 1971. Under this system, there were 20 shillings (s) in a pound (£ or L) and twelve pence (d) in a shilling (so, 240 pence in a pound). This made electronic book-keeping in its earliest era in Britain rather difficult. Let's see how Julia would solve the problem.

Type definition

First of all, we need a type definition. We also know that this would be a composite type, since we want it to hold three values (known in this context as 'fields') - one for each of pounds, shillings and pence. We also know that these would have to be integers.


In [4]:
type LSD             # Type name (mutable, since it is declared with 'type', not 'immutable')
    pounds::Int      # LSD.pounds (Int type)
    shillings::Int   # LSD.shillings (Int type)
    pence::Int       # LSD.pence (Int type)
end

You don't strictly need to define types, but the narrower the types you define for fields when you create a new type, the more efficient the code Julia can generate (as we explored above).

Thus, pounds::Int is faster than an untyped pounds. Note that pounds::Int64 gains nothing further over pounds::Int on a 64-bit system, where Int is simply another name for Int64. At any rate, avoid not defining any data types, which Julia will understand as referring to the global supertype ::Any, unless that indeed is what you want your field to embrace.

Constructor function

We have a good start, but we are not quite there yet.

Every type can have a constructor function, the function executed when a new instance of a type is created. This is sort of like the constructor of a class in OOP.

A constructor function sits inside the type definition and has the same name as the type. On its own, it looks like this:


In [5]:
function LSD(l,s,d)
    if l < 0 || s < 0 || d < 0 # || means 'or'
        error("No negative numbers, please! We're British!") # raises an exception
    end
    if d > 12 || s > 20
        error("That's too many pence or shillings!")
    end
    new(l,s,d) # creates a new LSD instance with l, s and d as its pounds, shillings and pence
end


Out[5]:
LSD

Don't worry if this looks a little strange – since we haven't dealt with functions yet, most of this is going to be alien to you.

What the function LSD(l,s,d) does is to, first, test whether any of l, s or d are negative or whether there are more pence or shillings than there could be in a shilling or a pound, respectively. In both of these cases, it raises an error. If the values do comply, it creates the new instance of the LSD composite type using the new(l,s,d) keyword.

The full type definition can be wrapped up together with this constructor function. It should look like this:


In [6]:
type LSD
    pounds::Int
    shillings::Int
    pence::Int

    function LSD(l,s,d)
        if l < 0 || s < 0 || d < 0
            error("No negative numbers, please! We're British!")
        end
        if d > 12 || s > 20
            error("That's too many pence or shillings!")
        end
        new(l,s,d)
    end
end

Note: If you create a new type with the same name without restarting the Jupyter kernel/environment, it does not always overwrite the original definition. You may need to restart the kernel to get the new constructor.

As we can see, we can now create valid prices in the old LSD system:


In [7]:
biscuits = LSD(0,1,3)


Out[7]:
LSD(0,1,3)

And the constructor function makes sure we don't contravene the constraints we set up earlier:


In [8]:
sausages = LSD(1,25,31)


LoadError: That's too many pence or shillings!
while loading In[8], in expression starting on line 1

In [6]:
national_debt = LSD(-1000000000,0,0)


LoadError: No negative numbers, please! We're British!
while loading In[6], in expression starting on line 1

We can, of course, use dot notation to access constituent values of the type, the names of which derive from the beginning of our definition:


In [15]:
biscuits.pence


Out[15]:
3

Type methods

Let's see how our new type deals with some simple maths:


In [12]:
biscuits = LSD(0,1,3)


Out[12]:
LSD(0,1,3)

In [24]:
gravy = LSD(0,0,5)


Out[24]:
LSD(0,0,5)

In [25]:
biscuits + gravy


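LoadError: MethodError: `+` has no method matching +(::LSD, ::LSD)
while loading In[25], in expression starting on line 1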

Oops, that's not great. What the error message means is that the function + (addition) has no 'method' for two instances of type LSD (as you remember, :: is short for 'type of').

A 'method', in Julia, is a type-specific way for an operation or function to behave. As we will discuss in detail later on, most functions and operators in Julia are actually shorthands for a bundle of multiple methods. Julia decides which of these to call given the input, a feature known as multiple dispatch.

So, for instance, + given the input ::Int means numerical addition, but something rather different for two Boolean values:


In [1]:
true + true


Out[1]:
2

In fact, + is the 'shorthand' for over a hundred methods. You can see all of these by calling methods() on +:


In [20]:
methods(+)


Out[20]:
171 methods for generic function +:

...and so on. What we need is there to be a method that accommodates the type LSD. We do that by creating a method of + for the type LSD.

Again, the function itself is less important here (it will be trivial after reading the chapter on Functions); what matters is the idea of creating a method to augment an existing function/operator to handle our new type:


In [2]:
import Base: +   # extend Base's + explicitly instead of shadowing it

function +(a::LSD, b::LSD)
    newpence = a.pence + b.pence
    newshillings = a.shillings + b.shillings
    newpounds = a.pounds + b.pounds
    subtotal = newpence + newshillings * 12 + newpounds * 240
    (pounds, balance) = divrem(subtotal, 240)
    (shillings, pence) = divrem(balance, 12)
    LSD(pounds, shillings, pence)
end


Out[2]:
+ (generic function with 172 methods)

When entering it, Julia tells us that + now has one more method. Indeed, methods(+) shows that the new method for two LSDs is registered:


In [3]:
methods(+) # one more + method


Out[3]:
172 methods for generic function +:

And now we know the price of biscuits and gravy:


In [4]:
biscuits + gravy


Out[4]:
LSD(0,1,8)

Representation of types

Every type has a particular 'representation', which is what we encountered every time the REPL showed us the value of an object after entering an expression or a literal. It probably won't surprise you that representations are methods of the Base.show() function, and a new method to 'pretty-print' our LSD type (similar to creating a __repr__ or __str__ function in a Python class's declaration) can be created the same way:


In [5]:
function Base.show(io::IO, money::LSD)
    print(io, "£$(money.pounds), $(money.shillings)s, $(money.pence)d.")
end



Base.show has two arguments: the output channel, which we do not need to concern ourselves with, and the second argument, which is the value to be displayed. We declared a function that used the print() function to use the output channel on which Base.show() is called, and display the second argument, which is a string formatted version of the LSD object.
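
The same pattern works for any type. Here is a minimal, self-contained sketch with a hypothetical Temperature type; it assumes a recent Julia release, where immutable types are declared with struct, and it relies on string() and string interpolation printing values through show.

```julia
struct Temperature             # hypothetical type; `struct` is the modern `immutable`
    degrees::Float64
end

# One show method controls how the value is printed everywhere.
Base.show(io::IO, t::Temperature) = print(io, t.degrees, " °C")

rendered = string(Temperature(21.5))                  # string() goes through show
in_sentence = "It is $(Temperature(-3.0)) outside"    # so does interpolation
```

Writing to the io argument, rather than printing directly, is what lets the same method serve the REPL, string() and interpolation alike.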

Our pretty-printing worked:


In [6]:
biscuits + gravy


Out[6]:
£0, 1s, 8d.

Our new type is looking quite good!

What next for LSD?

Of course, the LSD type is far from ready. We need to define a list of other methods, from subtraction to division, but the general concept ought to be clear. A new type is easy to create, but when doing so, you as a developer need to keep in mind what you and your users will do with this new type, and create methods accordingly. Chapter [X] will discuss methods in depth, but this introduction should help you think intelligently about creating new types.
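
As one possible next step, subtraction could be sketched along the same lines as addition. Everything below is an illustration under stated assumptions rather than the definitive design: it uses the syntax of a recent Julia release (mutable struct in place of type), and the helpers total_pence and from_pence are hypothetical names introduced here to avoid repeating the conversion arithmetic.

```julia
import Base: +, -

mutable struct LSD             # `mutable struct` is the modern spelling of `type`
    pounds::Int
    shillings::Int
    pence::Int

    function LSD(l, s, d)
        if l < 0 || s < 0 || d < 0
            error("No negative numbers, please! We're British!")
        end
        if d > 12 || s > 20
            error("That's too many pence or shillings!")
        end
        new(l, s, d)
    end
end

# Hypothetical helpers: convert to and from a plain number of pence.
total_pence(m::LSD) = m.pence + 12 * m.shillings + 240 * m.pounds

function from_pence(total)
    (pounds, balance) = divrem(total, 240)
    (shillings, pence) = divrem(balance, 12)
    LSD(pounds, shillings, pence)
end

+(a::LSD, b::LSD) = from_pence(total_pence(a) + total_pence(b))
-(a::LSD, b::LSD) = from_pence(total_pence(a) - total_pence(b))

biscuits = LSD(0, 1, 3)
gravy = LSD(0, 0, 5)
total = biscuits + gravy       # 15d + 5d = 20d, i.e. 1s 8d
change = LSD(1, 0, 0) - total  # 240d - 20d = 220d, i.e. 18s 4d

# Spending more than you have gives a negative amount, which the constructor rejects.
overdrawn = try
    gravy - biscuits
    false
catch
    true
end
```

Routing both operators through a single number of pence keeps the carrying logic in one place, and the constructor's checks then apply to every result automatically.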

Conclusion

In this chapter, we learned about the way Julia's type system is set up. The issue of types will be in the background of most of what we do in the future, so feel free to refer back to this chapter as frequently as you feel the need to. In the next chapter, we will be exploring collections, a category of types that share one important property – they all act as 'envelopes' for multiple elements, each with their distinct type.

Appendix: Julia Types Crib Sheet

This is a selection of Julia's type tree, omitting quite a few elements. To see the full thing, you can use Tanmay Mohapatra's julia_types.jl.

+- Any << abstract immutable size:0 >>
.  +- StaticVarInfo << concrete mutable size:24 >>
.  +- NotFound << concrete mutable pointerfree size:0 >>
.  +- Colon << concrete mutable pointerfree size:0 >>
.  +- MmapArrayInfo << concrete mutable size:24 >>
.  +- Exception << abstract immutable size:0 >>
.  .  +- ArgumentError << concrete mutable size:8 >>
.  .  +- TypeError << concrete mutable size:32 >>
.  .  +- SystemError << concrete mutable size:16 >>
.  .  +- EOFError << concrete mutable pointerfree size:0 >>
.  .  +- KeyError << concrete mutable size:8 >>
.  .  +- StackOverflowError << concrete mutable pointerfree size:0 >>
.  .  +- LoadError << concrete mutable size:24 >>
.  .  +- DisconnectException << concrete mutable pointerfree size:0 >>
.  .  +- InterruptException << concrete mutable pointerfree size:0 >>
.  .  +- DivideByZeroError << concrete mutable pointerfree size:0 >>
.  .  +- MemoryError << concrete mutable pointerfree size:0 >>
.  .  +- MethodError << concrete mutable size:16 >>
.  .  +- UndefRefError << concrete mutable pointerfree size:0 >>
.  .  +- ErrorException << concrete mutable size:8 >>
.  .  +- OverflowError << concrete mutable pointerfree size:0 >>
.  .  +- DomainError << concrete mutable pointerfree size:0 >>
.  .  +- InexactError << concrete mutable pointerfree size:0 >>
.  .  +- UVError << concrete mutable size:16 >>
.  .  +- ParseError << concrete mutable size:8 >>
.  .  +- BoundsError << concrete mutable pointerfree size:0 >>
.  +- ProcessChain << concrete mutable size:32 >>
.  .  +- ProcessChainOrNot = Union(Bool,ProcessChain)